(54) Method and apparatus for maintaining packet order integrity in a parallel switching engine



(57) A method and apparatus for maintaining packet
order integrity in a switching engine wherein inbound
packets are forwarded to different ones of parallel
processing elements for switching. Order preservation
for packets relating to the same conversation is guaranteed
by checking for each inbound packet whether a
previous packet from the same source is pending at a
processing element and, if the check reveals that such a
packet is pending, forwarding the inbound packet to the
same processing element as the previous packet.



[Figure 1: block diagram. Recoverable labels: INPUT UNIT, PROCESSOR ARRAY MODULE, INPUT BUFFER, SWITCH FABRIC, OUTPUT UNIT 120, OUTPUT BUFFER 122, PHYSICAL OUTPUT.]




EP 1 061 695 A2

Description



BACKGROUND OF THE INVENTION 

[0001] The present invention relates to data switching,
and more particularly to data switching engines of
the kind in which a processor array is used to switch
data from a plurality of sources to a plurality of
destinations in a data communication network.
[0002] In recent years, high-speed data communication
switching has been accomplished mostly in application
specific integrated circuits (ASICs).
Programmable logic devices have generally been considered
too slow to be relied upon as main switching
engines. With recent improvements in programmable
logic technology, however, a trend now appears to be
emerging toward implementing multiple programmable
logic devices in parallel, or parallel processor arrays, as
primary data switching engines.

[0003] Processor array switching engines provide
certain advantages over ASIC switching engines in
terms of time-to-market, flexibility and scalability. Still,
the "parallel" aspect of processor array switching
engines creates technical challenges. Foremost among
these is how to best allocate the resources of the array.
One possibility is to strictly dedicate each processor in
the array to a particular group of sources. However,
such a dedicated processor array is inefficient since a
processor is idle whenever the sources to which it is
dedicated are not transmitting packets, even while other
processors may be overburdened. A second possibility
is to allow each processor in the array to be shared by
all sources. Such a shared processor array might
greatly increase overall switching efficiency, especially
when implemented in conjunction with an efficient load
balancing algorithm ensuring that inbound packets are
transmitted to the processors presently being underutilized.
However, a shared processor array gives rise to
other problems, such as how to preserve packet order
integrity.
[0004] A problem of preserving packet order integrity
arises in shared processor arrays because at any
given time in the operational cycle of such an array, the
time required to process a packet will vary from
processor to processor. Thus, packets may be switched out of
the array in an order different from that in which they
were transmitted to the array for switching. While a
departure from strict "first in, first out" sequencing is not
a problem for packets applicable to different conversations,
it may be for packets applicable to the same
conversation.
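
The reordering hazard described above can be illustrated with a minimal sketch. It is not taken from the specification: the function name `departure_order`, the one-packet-per-time-unit arrival model, and the fixed per-element service times are all assumptions made for illustration only.

```python
def departure_order(arrivals, service_time):
    """Return (source, seq) pairs in the order packets leave the array.

    arrivals: list of (source, seq, element) tuples; the packet at list
        index t is assumed to arrive at time t (illustrative model only).
    service_time: dict mapping element index -> time needed per packet.
    Each element is modeled as a FIFO queue that handles one packet at a time.
    """
    free_at = {}    # element -> time at which it finishes its current queue
    finished = []
    for t, (source, seq, element) in enumerate(arrivals):
        start = max(t, free_at.get(element, 0))
        free_at[element] = start + service_time[element]
        finished.append((free_at[element], t, source, seq))
    finished.sort()  # earliest finish time leaves the array first
    return [(source, seq) for _, _, source, seq in finished]


# Two packets of the same source sent to different elements:
# element 0 is slow (5 time units), element 1 is fast (1 time unit).
split = [("A", 0, 0), ("A", 1, 1)]
print(departure_order(split, {0: 5, 1: 1}))

# The same two packets kept on one element leave in arrival order.
same = [("A", 0, 0), ("A", 1, 0)]
print(departure_order(same, {0: 5, 1: 1}))
```

In this toy model the second packet overtakes the first when the two are split across elements, and the order is kept when both stay on one element.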

[0005] Accordingly, there is a need for a way to
ensure, in a processor array in which the processing
elements are shared among all sources, that packets from
the same source leave the array in the sequence in
which they arrived. And there is a need for preserving
packet ordering for packets from a common source
without imposing too high a tax on switching performance.



SUMMARY OF THE INVENTION 

[0006] The present invention provides a method 
and apparatus for preserving packet order integrity in a 
shared processor array. The order for packets relating to 
the same conversation is maintained by checking for 
each inbound packet whether a previous packet from 
the same source is pending at a processing element 
before forwarding the packet to the processor array. If 
the check reveals that such a packet is pending, the 
inbound packet is forwarded to the same processing 
element as the previous packet. If the check reveals that 
no packet from the same source is pending at any 
processing element, the inbound packet is forwarded to 
a processing element in accordance with a load balanc- 
ing algorithm. 

[0007] The present invention may be better under- 
stood by reference to the following detailed description, 
taken in conjunction with the accompanying drawings 
which are briefly described below. Of course, the actual 
scope of the invention is defined by the appended 
claims. 

BRIEF DESCRIPTION OF THE DRAWINGS 
[0008] 

Figure 1 illustrates a portion of a data communica- 
tion switching architecture; 

Figure 2 illustrates the processor array module of 
Figure 1 in greater detail; 

Figure 3 illustrates the control lines for control flows 
between an input controller and a processing ele- 
ment of Figure 2; 

Figure 4 illustrates an input controller of Figure 2 in 
greater detail; 

Figure 5 illustrates the format of a bit mask stored in 
the PE mask register of Figure 4; 
Figure 6 illustrates a processing element of Figure 
2 in greater detail; 

Figure 7 illustrates the format of backlog registers 
of Figure 6 in greater detail; 

Figure 8 is a flow diagram illustrating a check per- 
formed at an input controller of Figure 2 before for- 
warding a packet to a processing element; 
Figure 9 is a flow diagram illustrating a packet back- 
log update function performed at a processing ele- 
ment of Figure 2; and 

Figure 10 is a flow diagram illustrating a packet
backlog update and bit mask reset function
performed at a processing element of Figure 2.

DETAILED DESCRIPTION OF THE PREFERRED 
EMBODIMENT 

[0009] In Figure 1, an input unit 110 and output unit
120 for a data switching architecture are shown. In the
complete architecture (not shown), one or more input
units and one or more output units are coupled via a
switch fabric 130 such that every physical input (and its
associated input unit) may transfer data to every physical
output (and its associated output unit). At any given
instant in time, a subset (or all) of the input units receive
data destined for a subset (or all) of the output units. Data may
therefore be imagined as flowing from left to right, for
example, from input unit 110 to output unit 120. Each
input unit has a plurality of physical inputs and each
output unit has one or more physical outputs. Data may be
transmitted through the architecture in variable length
packets, in fixed length cells, or both, although any such
discrete unit of data will be referred to as a "packet"
herein for clarity and consistency. In a preferred switching
operation between input unit 110 and output unit
120, a packet received on one of the physical inputs and
destined for the physical output arrives at processor
array module 112 and is switched and forwarded to
input buffer 114. The packet is eventually released on
fabric data bus 142 to switch fabric 130 and arrives at
output buffer 122, where the packet remains stored until
eventual delivery on the physical output.
[0010] In Figure 2, processor array module 112 is
shown in greater detail. Processor array module 112 is
responsible for switching packets from physical inputs to
physical outputs. In its most basic feature, the switching
operation performed in module 112 involves interpreting
and modifying packet control fields, such as addresses
encoded in packet headers, to ensure delivery of packets
on appropriate physical outputs. Processor array
module 112 has M input controllers 210 associated with
physical inputs. Input controllers 210 are coupled via L
module data buses 220 to processor array 230 which
includes L processing elements. Module data buses
220 are arranged such that each processing element
receives data on a particular bus and each input controller
may transmit data on each bus. Processing elements
230 share external data bus 142 for forwarding packets
to input buffer 114 after the packet switching operation
has been completed.

[0011] An important object of the invention is to
implement a shared processor array which preserves
the sequence of packets applicable to the same conversation.
This preservation of packet order integrity is
achieved in a preferred embodiment by implementing a
"commit-and-release" protocol. Input controllers 210
deliver packets from different sources to processor
array 230. Uncommitted sources become committed to
a particular processing element in array 230 upon
forwarding an inbound packet from the source to the element.
All subsequent inbound packets from the source
are forwarded to the element while the commitment is in
effect. The commitment is terminated after all packets
from the source have been switched out of the array.
This "commit-and-release" protocol guarantees that
packets applicable to the same conversation are
switched out of a shared processor array in their
order of arrival without unduly hindering switching
performance. In furtherance of this basic inventive feature,
input controllers 210 and processing elements in array
230 are coupled by control lines. Turning to Figure 3, a
representative input controller 310 and processing
element 320 are coupled by mask reset line 322 and backlog
update line 324. Processing element 320 invokes
lines 322, 324 to provide feedback to input controller
310 about current conditions at element 320 which
controller 310 must know to correctly decide which processing
element within array 230 to select when forwarding
inbound packets. Particularly, mask reset line 322 is
invoked to instruct controller 310 that element 320 has
no more packets pending from the source. This instruction
in effect releases the source from a previous commitment
to element 320 so that a processing element
for the next inbound packet from the source may be
selected on the basis of efficiency, rather than selecting
element 320 out of concern for preserving packet order.
Backlog update line 324 is invoked to inform controller
310 about the current backlog of packets pending in
element 320 from sources associated with all input controllers
210. When a source associated with controller 310
is in the uncommitted state, controller 310 compares
backlog information provided by all processing elements
in array 230 to assess the relative efficiency of
forwarding inbound packets from the source to element
320.
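
As a rough illustration of the "commit-and-release" protocol, the following sketch models only the bookkeeping, in software, under stated assumptions. The class name `CommitReleaseDispatcher` and its methods are hypothetical and do not appear in the specification; the mask reset and backlog update control lines are collapsed into direct method calls, and lowest backlog is used as the efficiency criterion.

```python
class CommitReleaseDispatcher:
    """Toy model of commit-and-release for one processor array."""

    def __init__(self, num_elements):
        self.backlog = [0] * num_elements  # packets pending at each element
        self.committed = {}                # source -> element index
        self.pending = {}                  # source -> packets still in the array

    def forward(self, source):
        """Pick the element for an inbound packet and record it as pending."""
        if source in self.committed:
            # Committed source: stay on the same element to keep order.
            element = self.committed[source]
        else:
            # Uncommitted source: choose the least-backlogged element and commit.
            element = min(range(len(self.backlog)), key=lambda i: self.backlog[i])
            self.committed[source] = element
        self.backlog[element] += 1
        self.pending[source] = self.pending.get(source, 0) + 1
        return element

    def switched_out(self, source):
        """Record that one packet of `source` has left the array."""
        element = self.committed[source]
        self.backlog[element] -= 1
        self.pending[source] -= 1
        if self.pending[source] == 0:
            # Last pending packet is gone: release the commitment.
            del self.committed[source]
            del self.pending[source]
        return element


d = CommitReleaseDispatcher(num_elements=2)
first = d.forward("A")       # A commits to the least-backlogged element
second = d.forward("A")      # still committed, same element
other = d.forward("B")       # B goes to the other, idle element
d.switched_out("A")
d.switched_out("A")          # A is released here
print(first, second, other, "A" in d.committed)
```

The design point the sketch tries to capture is that load balancing is consulted only at the moment a source is uncommitted; while any packet from the source is pending, the earlier choice is simply reused.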

[0012] The operation of processor array module
210 will now be described in even more detail by
reference to Figures 4-10. Referring first to Figure 4, a
representative input controller 400 is illustrated. Inbound
packets arrive at controller 400 on physical input IP_IN
and are written to input queue 404, and write address
counter 406 is incremented. PE resolve logic 412
monitors write address counter 406 and read address
counter 410. When an inbound packet is pending in queue
404, PE resolve logic 412 selects a processing element
and transmits a packet release request to PE_X_BUS
control logic 424, the control logic element for the
module data bus PE_X_BUS on which the selected
processing element listens. Eventually, logic 424 grants
the request. AND gates 414 are enabled and the
inbound packet is read from queue 404 and transmitted,
along with a source identifier retrieved from source port
ID register 402, on the bus PE_X_BUS to the selected
processing element. It bears noting that although in the
illustrated embodiment controller 400 has only one
physical input, in other embodiments a controller may
have one or more physical inputs. Moreover, while in the
illustrated embodiment all inbound packets arriving at
controller 400 on the physical input are attributed to the
same source, in other embodiments inbound packets
arriving at a controller on a common physical input but
having different source addresses may be attributed to
different sources.

[0013] In order to make a correct processing element
selection for an inbound packet, PE resolve logic
412 determines whether the source for the inbound
packet is in the committed or uncommitted state. This
determination is assisted by a bit mask retained in PE
mask register 408. The format 500 of the mask
in PE mask register 408 is shown in Figure 5. Each of
the L processing elements active in processor array 230
is assigned a bit position within the mask. The mask is
read by PE resolve logic 412 [illegible]. If a
bit in the mask is set, the source is currently committed
to the processing element whose bit is set and PE
resolve logic 412 selects that processing element. If no
bit in the mask is set, however, the source is currently
uncommitted and logic 412 may select a processing
element on the basis of efficiency. PE resolve logic 412
[illegible] compares backlog information received
from the processing elements on backlog update lines
424 and selects the processing element whose backlog
is at present lowest. Of course, other load balancing
algorithms are possible in which factors other than
current backlog are determinative when selecting a
processing element for inbound packets from
uncommitted sources.
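
The mask check can be sketched as a pair of bit operations. This is an illustrative model only: the function names `resolve_pe` and `reset_mask`, the convention that bit i corresponds to processing element i, and the assumption that at most one bit is ever set are not stated in this form in the specification.

```python
def resolve_pe(mask, backlogs):
    """Select a processing element for the next packet of one source.

    mask: integer bit mask for the source; bit i set means the source is
        committed to element i (assumed one-hot or zero).
    backlogs: list of the latest backlog counts reported by each element.
    Returns (selected element index, updated mask).
    """
    if mask:
        # A bit is set: the source is committed, so reuse that element.
        return mask.bit_length() - 1, mask
    # No bit set: the source is uncommitted, so pick the lowest backlog
    # and set that element's bit to commit the source.
    element = min(range(len(backlogs)), key=lambda i: backlogs[i])
    return element, 1 << element


def reset_mask(mask, element):
    """Clear the bit for `element`, as a reset instruction would."""
    return mask & ~(1 << element)


element, mask = resolve_pe(0b0000, [3, 1, 2, 5])
print(element, bin(mask))              # uncommitted: lowest backlog wins
print(resolve_pe(mask, [9, 9, 0, 0]))  # committed: backlogs are ignored
print(reset_mask(mask, element))       # released: mask returns to zero
```

Because the committed branch never looks at `backlogs`, a committed source cannot be moved by a momentary change in load, which is the property the mask exists to provide.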

[0014] Sources are switched between the committed
and uncommitted state by setting and resetting the
mask in PE mask register 408. The mask is set when a
processing element for an uncommitted source is
selected on the basis of efficiency. Particularly, the bit
reserved for the selected processing element is set.
The mask is reset when the previously
selected processing element transmits a reset instruction
to controller 400 after the last packet from the source
has been switched out of the element.
The reset instruction is transmitted on the one of the mask
reset lines driven by the element, causing the mask
to be reset.
[0015] Referring now to Figure 6, a representative
processing element 600 is illustrated in more detail.
Packets (including the source identifier) arrive off a
module bus (e.g. PE_X_BUS) at element 600 and are
received at parsing unit 610. Parsing unit 610
strips off the packet header (including the source identifier)
and deposits the inbound header in header buffer
630. The packet payload flows to data buffer 640.
Processor logic 620 reviews the inbound header and
converts the inbound header into an outbound header
sufficient to ensure delivery of packets on appropriate
physical outputs. The outbound header and payload for
the packet are eventually released to packet reassembly
unit 650, where the outbound header is reassembled
with the payload and the reassembled packet is
released on external data bus 142.
[0016] Processing element 600 monitors the backlog
of packets pending at element 600 from each source
through the expedient of backlog registers 700. The
form of backlog registers 700 is shown in Figure 7. A
backlog register is assigned to each of the sources
active in processor array module 210. Each register
contains a backlog count reflecting the current number of
packets pending at element 600 from a particular
source. For every packet from a particular source
received at element 600, processor logic 620 increments
the backlog count in the register reserved for the
source. For every packet from a particular source which
is switched out of element 600, processor logic 620 decrements
the backlog count in the register reserved for the
source. When a backlog count in a register is decremented
from one to zero, element 600 transmits a reset instruction
to the input controller associated with the source
whose backlog value reverted to zero on the appropriate
one of mask reset lines 622 to release the source
from its commitment to element 600. Element 600 also
invokes backlog update line 624 on a regular basis to
inform all input controllers of the current aggregate (i.e.,
all source) backlog count at element 600 in order to
provide uncommitted sources an updated view of the
backlog at the processing elements when selecting
elements for inbound packets on the basis of efficiency.
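
The backlog bookkeeping at a processing element might be modeled along the following lines. This is a hedged sketch, not the hardware described: the class `ProcessingElement`, its method names, and the `on_mask_reset` callback (standing in for the mask reset lines) are assumptions introduced for illustration, and a dictionary replaces the fixed bank of backlog registers.

```python
class ProcessingElement:
    """Toy model of per-source backlog counting at one element."""

    def __init__(self, on_mask_reset):
        self.backlog = {}                   # source -> packets pending here
        self.on_mask_reset = on_mask_reset  # stand-in for the mask reset lines

    def packet_received(self, source):
        """A packet from `source` arrived: increment its backlog count."""
        self.backlog[source] = self.backlog.get(source, 0) + 1

    def packet_switched_out(self, source):
        """A packet from `source` left: decrement, and release on zero."""
        self.backlog[source] -= 1
        if self.backlog[source] == 0:
            # Count went from one to zero: tell the input controller
            # that the source is no longer committed to this element.
            self.on_mask_reset(source)

    def aggregate_backlog(self):
        """All-source backlog, as reported on the backlog update line."""
        return sum(self.backlog.values())


resets = []
pe = ProcessingElement(on_mask_reset=resets.append)
for src in ("A", "A", "B"):
    pe.packet_received(src)
pe.packet_switched_out("A")   # one packet of A still pending, no reset yet
pe.packet_switched_out("A")   # count reaches zero, reset is signaled
print(resets, pe.aggregate_backlog())
```

Note that the per-source counts drive the release decision, while only the aggregate is needed by the input controllers for load balancing.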

[0017] Turning now to Figure 8, a flow diagram illustrates
a check performed at an input controller of Figure
2 before forwarding an inbound packet to a processing
element. The packet is received at the input controller
on a physical input (810) and the bit mask is read (820).
A determination is made whether a bit in the mask is set
(830). If a bit in the mask is set, the source is presently
committed and the packet is forwarded to the processing
element whose bit is set to obviate any packet ordering
problem (840). If, however, no bit in the mask is set,
the source is not presently committed. Accordingly,
backlog information is referenced
and the packet is forwarded to the processing
element having the lowest backlog (834). Prior to
forwarding, however, the bit in the mask reserved for the
selected processing element is set to commit the source
to that element (832).

[0018] Referring to Figure 9, a flow diagram illustrates
a packet backlog update function performed at a
processing element of Figure 2. The packet is received
at a processing element from a module data bus (910).
The source is identified (920) and the backlog count for
the identified source is incremented (930).

[0019] Referring finally to Figure 10, a flow diagram
illustrates a packet backlog update and bit mask reset
function performed at a processing element of Figure 2.
The packet is retrieved from a buffer (1010) and the
source is identified (1020). The backlog count for the
identified source is decremented (1030). A check is
made whether the new backlog count is zero
(1040). If the new backlog count is zero, the bit mask for
the identified source is reset (1050).

[0020] It will be appreciated by those of ordinary
skill in the art that the invention can be embodied in
other specific forms without departing from the spirit or
essential character hereof. The described embodiment
is therefore in all respects considered illustrative and not
restrictive. The scope of the invention is defined by the
appended claims, and all changes that come within the
range of equivalents thereof are intended to be
embraced therein.

Claims 

1. A method for preserving packet order integrity in a 
switching engine wherein packets are forwarded to 
different ones of a plurality of processing elements 
within a processor array for switching, comprising 
checking for each inbound packet whether another 
packet from the same source is pending within the 
processor array before forwarding the inbound 
packet to the processor array. 

2. A method for preserving packet order integrity in a 
switching engine wherein packets are forwarded to 
different ones of a plurality of parallel processing 
elements for switching, comprising: 

checking for an inbound packet whether 
another packet from the same source is pend- 
ing at any processing element; and 
if another packet from the same source is 
pending at any processing element, forwarding 
the inbound packet to the processing element 
at which the other packet from the same source 
is pending. 

3. The method according to claim 2, further compris- 
ing: 

if no other packet from the same source is 
pending at any processing element, forwarding 
the inbound packet to a processing element 
selected in accordance with a predetermined 
algorithm. 

4. The method according to claim 2, further compris- 
ing: 

if no other packet from the same source is 
pending at any processing element, forwarding 
the inbound packet to a processing element 
selected in accordance with a predetermined 
load balancing algorithm. 

5. The method according to claim 2, further compris- 
ing: 

if no other packet from the same source is 
pending at any processing element, forwarding 
the inbound packet to the processing element 
having the lowest backlog of packets. 

6. A method for preserving packet order integrity in a
switching engine wherein packets are forwarded to
different ones of a plurality of processing elements
within a processor array for switching, comprising
checking for each inbound packet whether a source
of the packet is committed to a processing element
within the processor array before forwarding the
inbound packet to the processor array.

7. A method for preserving packet order integrity in a
switching engine wherein packets are forwarded to
different ones of a plurality of processing elements
within a processor array for switching, comprising:

determining if a source of an inbound packet is
committed to any processing element; and
if the source of the inbound packet is committed
to any processing element, forwarding the
inbound packet to the processing element to
which the source is committed.

8. The method according to claim 7, further comprising:

if the source of the inbound packet is not committed
to any processing element, forwarding
the inbound packet to a processing element
selected in accordance with a predetermined
algorithm.

9. The method according to claim 7, further comprising:

if the source of the inbound packet is not committed
to any processing element, forwarding
the inbound packet to a processing element
selected in accordance with a predetermined
load balancing algorithm.

10. The method according to claim 7, further comprising:

if the source of the inbound packet is not committed
to any processing element, forwarding
the inbound packet to the processing element
having the lowest backlog of packets.

11. In a switching engine wherein packets from different
sources are forwarded to different ones of a plurality
of parallel processing elements for switching,
a method for preserving packet order integrity,
comprising:

committing a source to a processing element;
and
forwarding inbound packets from the source to
the processing element and no other processing
element until the commitment is released.

12. The method according to claim 11, wherein the
commitment is released when all the inbound packets
from the source have been switched from the
processing element.